Dataset statistics
| Number of variables | 7 |
|---|---|
| Number of observations | 768 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 48.0 KiB |
| Average record size in memory | 64.0 B |
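As a consistency check, the average record size appears to be the total in-memory size divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes. The variable names are illustrative and are not part of the report.

```python
# Values copied from the "Dataset statistics" table above.
total_size_kib = 48.0
n_observations = 768

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```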
Variable types
| Numeric | 6 |
|---|---|
| Categorical | 1 |
Reproduction
| Analysis started | 2024-05-24 14:56:48.840648 |
|---|---|
| Analysis finished | 2024-05-24 14:56:57.472001 |
| Duration | 8.63 seconds |
| Software version | ydata-profiling v4.7.0 |
| Download configuration | config.json |
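The reported duration can be recovered from the two timestamps. This is a minimal sketch using only the standard library; the format string is an assumption based on how the timestamps are printed in the table.

```python
from datetime import datetime

# Assumed timestamp layout, matching the strings shown in the table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-05-24 14:56:48.840648", FMT)
finished = datetime.strptime("2024-05-24 14:56:57.472001", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} seconds")
```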
Glucose
Real number (ℝ)
| Distinct | 47 |
|---|---|
| Distinct (%) | 6.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 72.39761 |
| Minimum | 24 |
| Maximum | 122 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.0 KiB |
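The "Distinct (%)" figure is consistent with the distinct count divided by the 768 observations. The sketch below recomputes it for every variable in this report. The dictionary is simply the "Distinct" values copied from the per-variable tables, and rounding to one decimal is an assumption about how the report formats percentages.

```python
n_observations = 768

# "Distinct" counts copied from each variable's overview table.
distinct_counts = {
    "Glucose": 47,
    "BloodPressure": 186,
    "Insulin": 248,
    "BMI": 136,
    "DiabetesPedigreeFunction": 517,
    "Age": 52,
    "Outcome": 2,
}

# Share of distinct values, as a percentage rounded to one decimal.
distinct_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in distinct_counts.items()
}

for name, pct in distinct_pct.items():
    print(f"{name}: {pct}%")
```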
Quantile statistics
| Minimum | 24 |
|---|---|
| 5-th percentile | 52 |
| Q1 | 64 |
| median | 72.119492 |
| Q3 | 80 |
| 95-th percentile | 90 |
| Maximum | 122 |
| Range | 98 |
| Interquartile range (IQR) | 16 |
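The last two rows of the quantile table are derived from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A short sketch using the Glucose values above (variable names are illustrative):

```python
# Quantile values copied from the Glucose table above.
minimum, q1, q3, maximum = 24, 64, 80, 122

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50%

print(value_range, iqr)
```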
Descriptive statistics
| Standard deviation | 12.096396 |
|---|---|
| Coefficient of variation (CV) | 0.16708281 |
| Kurtosis | 1.0980642 |
| Mean | 72.39761 |
| Median Absolute Deviation (MAD) | 7.8805085 |
| Skewness | 0.13918694 |
| Sum | 55601.364 |
| Variance | 146.32279 |
| Monotonicity | Not monotonic |
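Several of the descriptive statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks these for Glucose using the rounded values printed in the table, so agreement is only expected up to rounding error.

```python
import math

# Rounded values copied from the Glucose tables above.
mean = 72.39761
std = 12.096396
n_observations = 768

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance from the standard deviation
total = mean * n_observations  # sum of all values

print(f"CV={cv:.6f} variance={variance:.4f} sum={total:.2f}")

# Compare against the reported figures, allowing for rounding of the inputs.
print(math.isclose(cv, 0.16708281, abs_tol=1e-6))
print(math.isclose(variance, 146.32279, abs_tol=1e-3))
print(math.isclose(total, 55601.364, abs_tol=1e-2))
```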
Histogram with fixed size bins (bins=47)
| Value | Count | Frequency (%) |
|---|---|---|
| 70 | 57 | 7.4% |
| 74 | 52 | 6.8% |
| 68 | 45 | 5.9% |
| 78 | 45 | 5.9% |
| 72 | 44 | 5.7% |
| 64 | 43 | 5.6% |
| 80 | 40 | 5.2% |
| 76 | 39 | 5.1% |
| 60 | 37 | 4.8% |
| 72.23898305 | 35 | 4.6% |
| Other values (37) | 331 | 43.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 24 | 1 | 0.1% |
| 30 | 2 | 0.3% |
| 38 | 1 | 0.1% |
| 40 | 1 | 0.1% |
| 44 | 4 | 0.5% |
| 46 | 2 | 0.3% |
| 48 | 5 | 0.7% |
| 50 | 13 | 1.7% |
| 52 | 11 | 1.4% |
| 54 | 11 | 1.4% |

| Value | Count | Frequency (%) |
|---|---|---|
| 122 | 1 | 0.1% |
| 114 | 1 | 0.1% |
| 110 | 3 | 0.4% |
| 108 | 2 | 0.3% |
| 106 | 3 | 0.4% |
| 104 | 2 | 0.3% |
| 102 | 1 | 0.1% |
| 100 | 3 | 0.4% |
| 98 | 3 | 0.4% |
| 96 | 4 | 0.5% |
BloodPressure
Real number (ℝ)
| Distinct | 186 |
|---|---|
| Distinct (%) | 24.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 154.95509 |
| Minimum | 14 |
| Maximum | 846 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.0 KiB |
Quantile statistics
| Minimum | 14 |
|---|---|
| 5-th percentile | 50 |
| Q1 | 121.5 |
| median | 154.33025 |
| Q3 | 154.33025 |
| 95-th percentile | 293 |
| Maximum | 846 |
| Range | 832 |
| Interquartile range (IQR) | 32.830247 |
Descriptive statistics
| Standard deviation | 85.02329 |
|---|---|
| Coefficient of variation (CV) | 0.54869632 |
| Kurtosis | 15.268338 |
| Mean | 154.95509 |
| Median Absolute Deviation (MAD) | 3.6697531 |
| Skewness | 3.039833 |
| Sum | 119005.51 |
| Variance | 7228.9599 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 154.3302469 | 374 | 48.7% |
| 105 | 11 | 1.4% |
| 140 | 9 | 1.2% |
| 130 | 9 | 1.2% |
| 120 | 8 | 1.0% |
| 100 | 7 | 0.9% |
| 94 | 7 | 0.9% |
| 180 | 7 | 0.9% |
| 135 | 6 | 0.8% |
| 115 | 6 | 0.8% |
| Other values (176) | 324 | 42.2% |

| Value | Count | Frequency (%) |
|---|---|---|
| 14 | 1 | 0.1% |
| 15 | 1 | 0.1% |
| 16 | 1 | 0.1% |
| 18 | 2 | 0.3% |
| 22 | 1 | 0.1% |
| 23 | 2 | 0.3% |
| 25 | 1 | 0.1% |
| 29 | 1 | 0.1% |
| 32 | 1 | 0.1% |
| 36 | 3 | 0.4% |

| Value | Count | Frequency (%) |
|---|---|---|
| 846 | 1 | 0.1% |
| 744 | 1 | 0.1% |
| 680 | 1 | 0.1% |
| 600 | 1 | 0.1% |
| 579 | 1 | 0.1% |
| 545 | 1 | 0.1% |
| 543 | 1 | 0.1% |
| 540 | 1 | 0.1% |
| 510 | 1 | 0.1% |
| 495 | 2 | 0.3% |
Insulin
Real number (ℝ)
| Distinct | 248 |
|---|---|
| Distinct (%) | 32.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 32.455956 |
| Minimum | 18.2 |
| Maximum | 67.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.0 KiB |
Quantile statistics
| Minimum | 18.2 |
|---|---|
| 5-th percentile | 22.235 |
| Q1 | 27.5 |
| median | 32.352224 |
| Q3 | 36.6 |
| 95-th percentile | 44.395 |
| Maximum | 67.1 |
| Range | 48.9 |
| Interquartile range (IQR) | 9.1 |
Descriptive statistics
| Standard deviation | 6.8751627 |
|---|---|
| Coefficient of variation (CV) | 0.21183054 |
| Kurtosis | 0.9199919 |
| Mean | 32.455956 |
| Median Absolute Deviation (MAD) | 4.5522241 |
| Skewness | 0.59890908 |
| Sum | 24926.174 |
| Variance | 47.267862 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 32 | 13 | 1.7% |
| 31.2 | 12 | 1.6% |
| 31.6 | 12 | 1.6% |
| 32.35222405 | 11 | 1.4% |
| 33.3 | 10 | 1.3% |
| 32.4 | 10 | 1.3% |
| 32.8 | 9 | 1.2% |
| 30.1 | 9 | 1.2% |
| 32.9 | 9 | 1.2% |
| 30.8 | 9 | 1.2% |
| Other values (238) | 664 | 86.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| 18.2 | 3 | 0.4% |
| 18.4 | 1 | 0.1% |
| 19.1 | 1 | 0.1% |
| 19.3 | 1 | 0.1% |
| 19.4 | 1 | 0.1% |
| 19.5 | 2 | 0.3% |
| 19.6 | 3 | 0.4% |
| 19.9 | 1 | 0.1% |
| 20 | 1 | 0.1% |
| 20.1 | 1 | 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 67.1 | 1 | 0.1% |
| 59.4 | 1 | 0.1% |
| 57.3 | 1 | 0.1% |
| 55 | 1 | 0.1% |
| 53.2 | 1 | 0.1% |
| 52.9 | 1 | 0.1% |
| 52.3 | 2 | 0.3% |
| 50 | 1 | 0.1% |
| 49.7 | 1 | 0.1% |
| 49.6 | 1 | 0.1% |
BMI
Real number (ℝ)
| Distinct | 136 |
|---|---|
| Distinct (%) | 17.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 120.89453 |
| Minimum | 0 |
| Maximum | 199 |
| Zeros | 5 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 79 |
| Q1 | 99 |
| median | 117 |
| Q3 | 140.25 |
| 95-th percentile | 181 |
| Maximum | 199 |
| Range | 199 |
| Interquartile range (IQR) | 41.25 |
Descriptive statistics
| Standard deviation | 31.972618 |
|---|---|
| Coefficient of variation (CV) | 0.26446703 |
| Kurtosis | 0.64077982 |
| Mean | 120.89453 |
| Median Absolute Deviation (MAD) | 20 |
| Skewness | 0.1737535 |
| Sum | 92847 |
| Variance | 1022.2483 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 99 | 17 | 2.2% |
| 100 | 17 | 2.2% |
| 106 | 14 | 1.8% |
| 111 | 14 | 1.8% |
| 129 | 14 | 1.8% |
| 125 | 14 | 1.8% |
| 105 | 13 | 1.7% |
| 95 | 13 | 1.7% |
| 108 | 13 | 1.7% |
| 102 | 13 | 1.7% |
| Other values (126) | 626 | 81.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 5 | 0.7% |
| 44 | 1 | 0.1% |
| 56 | 1 | 0.1% |
| 57 | 2 | 0.3% |
| 61 | 1 | 0.1% |
| 62 | 1 | 0.1% |
| 65 | 1 | 0.1% |
| 67 | 1 | 0.1% |
| 68 | 3 | 0.4% |
| 71 | 4 | 0.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| 199 | 1 | 0.1% |
| 198 | 1 | 0.1% |
| 197 | 4 | 0.5% |
| 196 | 3 | 0.4% |
| 195 | 2 | 0.3% |
| 194 | 3 | 0.4% |
| 193 | 2 | 0.3% |
| 191 | 1 | 0.1% |
| 190 | 1 | 0.1% |
| 189 | 4 | 0.5% |
DiabetesPedigreeFunction
Real number (ℝ)
| Distinct | 517 |
|---|---|
| Distinct (%) | 67.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4718763 |
| Minimum | 0.078 |
| Maximum | 2.42 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.0 KiB |
Quantile statistics
| Minimum | 0.078 |
|---|---|
| 5-th percentile | 0.14035 |
| Q1 | 0.24375 |
| median | 0.3725 |
| Q3 | 0.62625 |
| 95-th percentile | 1.13285 |
| Maximum | 2.42 |
| Range | 2.342 |
| Interquartile range (IQR) | 0.3825 |
Descriptive statistics
| Standard deviation | 0.3313286 |
|---|---|
| Coefficient of variation (CV) | 0.70215138 |
| Kurtosis | 5.5949535 |
| Mean | 0.4718763 |
| Median Absolute Deviation (MAD) | 0.1675 |
| Skewness | 1.9199111 |
| Sum | 362.401 |
| Variance | 0.10977864 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.254 | 6 | 0.8% |
| 0.258 | 6 | 0.8% |
| 0.261 | 5 | 0.7% |
| 0.207 | 5 | 0.7% |
| 0.259 | 5 | 0.7% |
| 0.238 | 5 | 0.7% |
| 0.268 | 5 | 0.7% |
| 0.304 | 4 | 0.5% |
| 0.299 | 4 | 0.5% |
| 0.692 | 4 | 0.5% |
| Other values (507) | 719 | 93.6% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.078 | 1 | 0.1% |
| 0.084 | 1 | 0.1% |
| 0.085 | 2 | 0.3% |
| 0.088 | 2 | 0.3% |
| 0.089 | 1 | 0.1% |
| 0.092 | 1 | 0.1% |
| 0.096 | 1 | 0.1% |
| 0.1 | 1 | 0.1% |
| 0.101 | 1 | 0.1% |
| 0.102 | 1 | 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 2.42 | 1 | 0.1% |
| 2.329 | 1 | 0.1% |
| 2.288 | 1 | 0.1% |
| 2.137 | 1 | 0.1% |
| 1.893 | 1 | 0.1% |
| 1.781 | 1 | 0.1% |
| 1.731 | 1 | 0.1% |
| 1.699 | 1 | 0.1% |
| 1.698 | 1 | 0.1% |
| 1.6 | 1 | 0.1% |
Age
Real number (ℝ)
| Distinct | 52 |
|---|---|
| Distinct (%) | 6.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 33.240885 |
| Minimum | 21 |
| Maximum | 81 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.0 KiB |
Quantile statistics
| Minimum | 21 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 24 |
| median | 29 |
| Q3 | 41 |
| 95-th percentile | 58 |
| Maximum | 81 |
| Range | 60 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 11.760232 |
|---|---|
| Coefficient of variation (CV) | 0.35378816 |
| Kurtosis | 0.64315889 |
| Mean | 33.240885 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 1.1295967 |
| Sum | 25529 |
| Variance | 138.30305 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 22 | 72 | 9.4% |
| 21 | 63 | 8.2% |
| 25 | 48 | 6.2% |
| 24 | 46 | 6.0% |
| 23 | 38 | 4.9% |
| 28 | 35 | 4.6% |
| 26 | 33 | 4.3% |
| 27 | 32 | 4.2% |
| 29 | 29 | 3.8% |
| 31 | 24 | 3.1% |
| Other values (42) | 348 | 45.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 21 | 63 | 8.2% |
| 22 | 72 | 9.4% |
| 23 | 38 | 4.9% |
| 24 | 46 | 6.0% |
| 25 | 48 | 6.2% |
| 26 | 33 | 4.3% |
| 27 | 32 | 4.2% |
| 28 | 35 | 4.6% |
| 29 | 29 | 3.8% |
| 30 | 21 | 2.7% |

| Value | Count | Frequency (%) |
|---|---|---|
| 81 | 1 | 0.1% |
| 72 | 1 | 0.1% |
| 70 | 1 | 0.1% |
| 69 | 2 | 0.3% |
| 68 | 1 | 0.1% |
| 67 | 3 | 0.4% |
| 66 | 4 | 0.5% |
| 65 | 3 | 0.4% |
| 64 | 1 | 0.1% |
| 63 | 4 | 0.5% |
Outcome
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.0 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 768 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 501 | 65.2% |
| 1 | 267 | 34.8% |
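The frequency column in these tables is consistent with each count divided by the total number of observations. A minimal sketch for the Outcome classes follows. Rounding to one decimal is an assumption about the report's formatting, and the names are illustrative.

```python
# Class counts copied from the "Common Values" table above.
counts = {"0": 501, "1": 267}

# With no missing cells, the counts add up to the number of observations.
n_observations = sum(counts.values())

# Relative frequency of each class, as a percentage with one decimal.
frequency_pct = {
    value: round(100 * count / n_observations, 1)
    for value, count in counts.items()
}

for value, pct in frequency_pct.items():
    print(f"Outcome={value}: {pct}%")
```

Roughly two thirds of the rows fall in class 0, so the target is moderately imbalanced.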
Length
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 501 | 65.2% |
| 1 | 267 | 34.8% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 768 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 501 | 65.2% |
| 1 | 267 | 34.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 768 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 501 | 65.2% |
| 1 | 267 | 34.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 768 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 501 | 65.2% |
| 1 | 267 | 34.8% |
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
| | Glucose | BloodPressure | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|
| 0 | 72.238983 | 154.330247 | 32.352224 | 84.0 | 0.304 | 21.0 | 1 |
| 0 | 58.000000 | 190.000000 | 34.000000 | 98.0 | 0.430 | 43.0 | 1 |
| 1 | 82.000000 | 154.330247 | 28.200000 | 112.0 | 1.282 | 50.0 | 0 |
| 1 | 75.000000 | 154.330247 | 35.700000 | 112.0 | 0.148 | 21.0 | 0 |
| 2 | 46.000000 | 83.000000 | 28.700000 | 139.0 | 0.654 | 22.0 | 1 |
| 2 | 64.000000 | 154.330247 | 30.800000 | 108.0 | 0.158 | 21.0 | 1 |
| 3 | 50.000000 | 154.330247 | 21.900000 | 161.0 | 0.254 | 65.0 | 0 |
| 3 | 80.000000 | 154.330247 | 24.600000 | 107.0 | 0.856 | 34.0 | 0 |
| 4 | 80.000000 | 370.000000 | 46.200000 | 134.0 | 0.238 | 46.0 | 1 |
| 4 | 90.000000 | 154.330247 | 29.900000 | 136.0 | 0.210 | 50.0 | 1 |

| | Glucose | BloodPressure | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|
| 604 | 52.000000 | 36.000000 | 27.8 | 74.0 | 0.269 | 22.0 | 1 |
| 605 | 64.000000 | 154.330247 | 34.2 | 111.0 | 0.260 | 24.0 | 0 |
| 606 | 74.000000 | 144.000000 | 36.1 | 138.0 | 0.557 | 50.0 | 1 |
| 607 | 88.000000 | 235.000000 | 39.3 | 126.0 | 0.704 | 27.0 | 0 |
| 608 | 76.000000 | 200.000000 | 35.9 | 122.0 | 0.483 | 26.0 | 0 |
| 609 | 64.000000 | 140.000000 | 28.6 | 139.0 | 0.411 | 26.0 | 0 |
| 610 | 122.000000 | 154.330247 | 22.4 | 96.0 | 0.207 | 27.0 | 0 |
| 611 | 86.000000 | 154.330247 | 45.6 | 101.0 | 1.136 | 38.0 | 1 |
| 612 | 72.238983 | 154.330247 | 42.4 | 141.0 | 0.205 | 29.0 | 1 |
| 613 | 96.000000 | 154.330247 | 22.5 | 125.0 | 0.262 | 21.0 | 0 |